Ch.12 Orthogonal Projection


Onto a Line


Imagine "walking" from the tip of $\vec{v}$ to the line along an orthogonal direction. Since the line is the span of some vector $\vec{s}$, that is, $l=\{c\cdot\vec{s}\ |\ c\in\mathbb{R}\}$, we are looking for the $c_p$ such that $\vec{v}-c_p\vec{s}$ is orthogonal to the line

To solve, notice that $\vec{v}-c_p\vec{s}$ must in particular be orthogonal to $\vec{s}$ itself, so
$$\vec{s}\cdot(\vec{v}-c_p\vec{s})=0\rightarrow \vec{s}\cdot\vec{v}-c_p(\vec{s}\cdot\vec{s})=0\rightarrow c_p=\frac{\vec{s}\cdot\vec{v}}{\vec{s}\cdot\vec{s}}$$
Thus, the orthogonal projection of $\vec{v}$ onto $l=[\vec{s}]$ is
$$\text{proj}_{[\vec{s}]}(\vec{v})=\frac{\vec{v}\cdot\vec{s}}{\vec{s}\cdot\vec{s}}\cdot\vec{s}$$
This vector $\vec{w}=\text{proj}_{[\vec{s}]}(\vec{v})$ is the only vector in the line $[\vec{s}]$ such that $\vec{v}-\vec{w}$ is orthogonal to every vector in $[\vec{s}]$

Example 12.1

The projection of the $\mathbb{R}^3$ vector $\vec{v}$ onto the line $L$, where
$$\vec{v}=\begin{pmatrix}3\\1\\1\end{pmatrix}\quad L=\{c\cdot\begin{pmatrix}1\\-2\\1\end{pmatrix}\ |\ c\in\mathbb{R}\}$$
is the vector
$$\text{proj}_L(\vec{v})=\frac{\begin{pmatrix}3\\1\\1\end{pmatrix}\cdot\begin{pmatrix}1\\-2\\1\end{pmatrix}}{\begin{pmatrix}1\\-2\\1\end{pmatrix}\cdot\begin{pmatrix}1\\-2\\1\end{pmatrix}}\cdot\begin{pmatrix}1\\-2\\1\end{pmatrix}=\frac{2}{6}\begin{pmatrix}1\\-2\\1\end{pmatrix}=\begin{pmatrix}1/3\\-2/3\\1/3\end{pmatrix}$$
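As a quick numerical check of this formula, here is a sketch assuming NumPy is available (`proj_line` is a hypothetical helper name, not from the text):

```python
import numpy as np

def proj_line(v, s):
    """Orthogonal projection of v onto the line spanned by s."""
    return (v @ s) / (s @ s) * s

# Example 12.1: project (3, 1, 1) onto the span of (1, -2, 1)
v = np.array([3.0, 1.0, 1.0])
s = np.array([1.0, -2.0, 1.0])
w = proj_line(v, s)  # gives (1/3, -2/3, 1/3)

# the residual v - w is orthogonal to s, as the derivation requires
assert abs((v - w) @ s) < 1e-12
```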


Gram-Schmidt Orthogonalization

Notice how $\vec{v}$ can be decomposed into
$$\vec{v}=\text{proj}_{[\vec{s}]}(\vec{v})+\left(\vec{v}-\text{proj}_{[\vec{s}]}(\vec{v})\right)$$
These two parts are orthogonal, so they are "non-interacting"; in particular, when both are nonzero they are linearly independent

Vectors $\vec{v}_1,...,\vec{v}_k\in\mathbb{R}^n$ are mutually orthogonal if any pair of them are orthogonal,
i.e. for any $i\ne j$, $\vec{v}_i$ and $\vec{v}_j$ are orthogonal
For example, the standard basis vectors are mutually orthogonal

If the vectors in a set $\{\vec{v}_1,...,\vec{v}_k\}\subset\mathbb{R}^n$ are mutually orthogonal and nonzero, then the set is linearly independent.

Proof

Consider $c_1\vec{v}_1+\cdots+c_k\vec{v}_k=\vec{0}$. For $i\in\{1,...,k\}$, taking the dot product with $\vec{v}_i$ on both sides gives
$$\vec{v}_i\cdot(c_1\vec{v}_1+\cdots+c_k\vec{v}_k)=\vec{v}_i\cdot\vec{0}\implies c_i(\vec{v}_i\cdot\vec{v}_i)=0$$
since every cross term $c_j(\vec{v}_i\cdot\vec{v}_j)$ with $j\ne i$ vanishes by orthogonality.
Since $\vec{v}_i\ne\vec{0}$, we have $\vec{v}_i\cdot\vec{v}_i\ne0$, therefore $c_i=0$.
Since all $c_i=0$, the set is linearly independent.

A corollary: any $k$ mutually orthogonal nonzero vectors in a $k$-dimensional vector space form a basis, because any $k$ linearly independent vectors in a $k$-dimensional space form a basis.
An orthogonal basis for a vector space is a basis of mutually orthogonal vectors

Gram-Schmidt Orthogonalization

If $\langle\vec{\beta}_1,...,\vec{\beta}_k\rangle$ is a basis for a subspace of $\mathbb{R}^n$, then the vectors
$$\begin{array}{rcl} \vec{\kappa}_1&=&\vec{\beta}_1\\ \vec{\kappa}_2&=&\vec{\beta}_2-\text{proj}_{[\vec{\kappa}_1]}(\vec{\beta}_2)\\ \vec{\kappa}_3&=&\vec{\beta}_3-\text{proj}_{[\vec{\kappa}_1]}(\vec{\beta}_3)-\text{proj}_{[\vec{\kappa}_2]}(\vec{\beta}_3)\\ &\vdots\\ \vec{\kappa}_k&=&\vec{\beta}_k-\text{proj}_{[\vec{\kappa}_1]}(\vec{\beta}_k)-\cdots-\text{proj}_{[\vec{\kappa}_{k-1}]}(\vec{\beta}_k) \end{array}$$
form an orthogonal basis for the same subspace. Moreover,
$$\text{span}(\vec{\kappa}_1,...,\vec{\kappa}_i)=\text{span}(\vec{\beta}_1,...,\vec{\beta}_i)\text{ for all }i=1,...,k$$

Proof

We use induction to show that each $\vec{\kappa}_i$ is nonzero, lies in $\text{span}(\vec{\beta}_1,...,\vec{\beta}_i)$, and is orthogonal to $\vec{\kappa}_1,...,\vec{\kappa}_{i-1}$:

Case $i=1$: this is trivial
Case $i=2$: we have
$$\vec{\kappa}_2=\vec{\beta}_2-\text{proj}_{[\vec{\kappa}_1]}(\vec{\beta}_2)=\vec{\beta}_2-\frac{\vec{\beta}_2\cdot\vec{\kappa}_1}{\vec{\kappa}_1\cdot\vec{\kappa}_1}\cdot\vec{\kappa}_1=\vec{\beta}_2-\frac{\vec{\beta}_2\cdot\vec{\kappa}_1}{\vec{\kappa}_1\cdot\vec{\kappa}_1}\cdot\vec{\beta}_1$$
This is nonzero because the $\vec{\beta}$'s are linearly independent, is clearly in $\text{span}(\vec{\beta}_1,\vec{\beta}_2)$, and is orthogonal to $\vec{\kappa}_1$ because the projection residual is orthogonal
Case $i=3$: we have
$$\vec{\kappa}_3=\vec{\beta}_3-\text{proj}_{[\vec{\kappa}_1]}(\vec{\beta}_3)-\text{proj}_{[\vec{\kappa}_2]}(\vec{\beta}_3)=\vec{\beta}_3-\frac{\vec{\beta}_3\cdot\vec{\kappa}_1}{\vec{\kappa}_1\cdot\vec{\kappa}_1}\cdot\vec{\kappa}_1-\frac{\vec{\beta}_3\cdot\vec{\kappa}_2}{\vec{\kappa}_2\cdot\vec{\kappa}_2}\cdot\vec{\kappa}_2$$
$$=\vec{\beta}_3-\frac{\vec{\beta}_3\cdot\vec{\kappa}_1}{\vec{\kappa}_1\cdot\vec{\kappa}_1}\cdot\vec{\beta}_1-\frac{\vec{\beta}_3\cdot\vec{\kappa}_2}{\vec{\kappa}_2\cdot\vec{\kappa}_2}\cdot\left(\vec{\beta}_2-\frac{\vec{\beta}_2\cdot\vec{\kappa}_1}{\vec{\kappa}_1\cdot\vec{\kappa}_1}\cdot\vec{\beta}_1\right)$$
This is nonzero and in $\text{span}(\vec{\beta}_1,\vec{\beta}_2,\vec{\beta}_3)$ because the $\vec{\beta}$'s are linearly independent, and it is not hard to check that it is orthogonal to $\vec{\kappa}_1$ and $\vec{\kappa}_2$
Continue in this fashion to prove the claim for all $i=1,...,k$

Note that if $\langle\vec{\beta}_1,...,\vec{\beta}_k\rangle$ is already orthogonal, the process just gives $\vec{\kappa}_i=\vec{\beta}_i$ for $i=1,...,k$

Example 12.2

Derive an orthogonal basis $K=\langle\vec{\kappa}_1,\vec{\kappa}_2\rangle$ from the basis
$$B=\langle\begin{pmatrix}1\\2\end{pmatrix},\begin{pmatrix}1\\3\end{pmatrix}\rangle$$

First, $\vec{\kappa}_1=\vec{\beta}_1=\begin{pmatrix}1\\2\end{pmatrix}$
Then,
$$\vec{\kappa}_2=\vec{\beta}_2-\text{proj}_{[\vec{\kappa}_1]}(\vec{\beta}_2)=\begin{pmatrix}1\\3\end{pmatrix}-\frac{\begin{pmatrix}1\\3\end{pmatrix}\cdot\begin{pmatrix}1\\2\end{pmatrix}}{\begin{pmatrix}1\\2\end{pmatrix}\cdot\begin{pmatrix}1\\2\end{pmatrix}}\cdot\begin{pmatrix}1\\2\end{pmatrix}=\begin{pmatrix}1\\3\end{pmatrix}-\frac{7}{5}\begin{pmatrix}1\\2\end{pmatrix}=\begin{pmatrix}-2/5\\1/5\end{pmatrix}$$
Thus, $K=\langle\begin{pmatrix}1\\2\end{pmatrix},\begin{pmatrix}-2/5\\1/5\end{pmatrix}\rangle$
Note that because $\begin{pmatrix}1\\2\end{pmatrix}\cdot\begin{pmatrix}-2/5\\1/5\end{pmatrix}=0$, the two vectors are orthogonal

Example 12.3

Derive an orthogonal basis $K$ for
$$B=\langle\begin{pmatrix}1\\1\\2\end{pmatrix},\begin{pmatrix}-1\\2\\1\end{pmatrix},\begin{pmatrix}0\\3\\-1\end{pmatrix}\rangle$$

$$\vec{\kappa}_1=\vec{\beta}_1=\begin{pmatrix}1\\1\\2\end{pmatrix}$$
$$\vec{\kappa}_2=\vec{\beta}_2-\text{proj}_{[\vec{\kappa}_1]}(\vec{\beta}_2)=\begin{pmatrix}-1\\2\\1\end{pmatrix}-\frac{\begin{pmatrix}-1\\2\\1\end{pmatrix}\cdot\begin{pmatrix}1\\1\\2\end{pmatrix}}{\begin{pmatrix}1\\1\\2\end{pmatrix}\cdot\begin{pmatrix}1\\1\\2\end{pmatrix}}\cdot\begin{pmatrix}1\\1\\2\end{pmatrix}=\begin{pmatrix}-1\\2\\1\end{pmatrix}-\frac{1}{2}\begin{pmatrix}1\\1\\2\end{pmatrix}=\begin{pmatrix}-3/2\\3/2\\0\end{pmatrix}$$
$$\vec{\kappa}_3=\vec{\beta}_3-\text{proj}_{[\vec{\kappa}_1]}(\vec{\beta}_3)-\text{proj}_{[\vec{\kappa}_2]}(\vec{\beta}_3)=\begin{pmatrix}0\\3\\-1\end{pmatrix}-\frac{\begin{pmatrix}0\\3\\-1\end{pmatrix}\cdot\begin{pmatrix}1\\1\\2\end{pmatrix}}{\begin{pmatrix}1\\1\\2\end{pmatrix}\cdot\begin{pmatrix}1\\1\\2\end{pmatrix}}\cdot\begin{pmatrix}1\\1\\2\end{pmatrix}-\frac{\begin{pmatrix}0\\3\\-1\end{pmatrix}\cdot\begin{pmatrix}-3/2\\3/2\\0\end{pmatrix}}{\begin{pmatrix}-3/2\\3/2\\0\end{pmatrix}\cdot\begin{pmatrix}-3/2\\3/2\\0\end{pmatrix}}\cdot\begin{pmatrix}-3/2\\3/2\\0\end{pmatrix}$$
$$=\begin{pmatrix}0\\3\\-1\end{pmatrix}-\frac{1}{6}\begin{pmatrix}1\\1\\2\end{pmatrix}-\frac{9/2}{9/2}\begin{pmatrix}-3/2\\3/2\\0\end{pmatrix}=\begin{pmatrix}4/3\\4/3\\-4/3\end{pmatrix}$$
So in summary,
$$K=\langle\begin{pmatrix}1\\1\\2\end{pmatrix},\begin{pmatrix}-3/2\\3/2\\0\end{pmatrix},\begin{pmatrix}4/3\\4/3\\-4/3\end{pmatrix}\rangle$$
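The whole process can be sketched in a few lines of NumPy (an assumption; `gram_schmidt` is a hypothetical helper name, not from the text):

```python
import numpy as np

def gram_schmidt(basis):
    """Orthogonalize linearly independent vectors; no normalization."""
    kappas = []
    for beta in basis:
        kappa = beta.astype(float).copy()
        for k in kappas:
            kappa -= (beta @ k) / (k @ k) * k  # subtract proj_[k](beta)
        kappas.append(kappa)
    return kappas

# Example 12.3
B = [np.array([1.0, 1.0, 2.0]),
     np.array([-1.0, 2.0, 1.0]),
     np.array([0.0, 3.0, -1.0])]
K = gram_schmidt(B)
# K[1] gives (-3/2, 3/2, 0) and K[2] gives (4/3, 4/3, -4/3), as above

# every pair in K is orthogonal
assert all(abs(K[i] @ K[j]) < 1e-12 for i in range(3) for j in range(i + 1, 3))
```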

Each vector in the orthogonal basis $K$ can be normalized to have length $1$, making it an orthonormal basis

A family of vectors in $\mathbb{R}^n$ is orthonormal if they are mutually orthogonal and all have length 1.
In other words, $\{\vec{\beta}_1,...,\vec{\beta}_l\}\subseteq\mathbb{R}^n$ is orthonormal if $\vec{\beta}_i\cdot\vec{\beta}_j=0$ for all $i,j\in\{1,...,l\}$ with $i<j$, and $\vec{\beta}_i\cdot\vec{\beta}_i=1$ for all $i$
If it is also a basis, then it is an orthonormal basis

Summary of Gram-Schmidt process

If $B_M=\langle\vec{b}_1,...,\vec{b}_k\rangle$ is an orthonormal basis for a subspace $M$, then every $\vec{v}\in M$ satisfies $\vec{v}=(\vec{v}\cdot\vec{b}_1)\vec{b}_1+\cdots+(\vec{v}\cdot\vec{b}_k)\vec{b}_k$

Proof Since $B_M$ is a basis for $M$, we can write $\vec{v}=c_1\vec{b}_1+\cdots+c_k\vec{b}_k$ with $c_1,...,c_k\in\mathbb{R}$. To find $c_i$, take the dot product with $\vec{b}_i$:
$$\begin{array}{lcl}\vec{v}\cdot\vec{b}_i&=&(c_1\vec{b}_1+\cdots+c_i\vec{b}_i+\cdots+c_k\vec{b}_k)\cdot\vec{b}_i\\&=&c_1(\vec{b}_1\cdot\vec{b}_i)+\cdots+c_i(\vec{b}_i\cdot\vec{b}_i)+\cdots+c_k(\vec{b}_k\cdot\vec{b}_i)\\&=&c_i\end{array}$$
since $\vec{b}_i\cdot\vec{b}_j=0$ for $i\ne j$ and $\vec{b}_i\cdot\vec{b}_i=1$
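This fact — that in an orthonormal basis each coordinate is a single dot product — can be illustrated numerically (a small sketch with NumPy; the subspace and basis below are assumptions for the example, not from the text):

```python
import numpy as np

# an orthonormal basis for a 2-dimensional subspace M of R^3
b1 = np.array([1.0, 0.0, 0.0])
b2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)

v = 2 * b1 + 3 * b2  # a vector in M with coordinates (2, 3) in this basis

# each coordinate is recovered as a dot product with the basis vector
c1, c2 = v @ b1, v @ b2
assert np.isclose(c1, 2.0) and np.isclose(c2, 3.0)
```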

We will say $\vec{w}\in\mathbb{R}^n$ is orthogonal to a subspace $M$ of $\mathbb{R}^n$ if it is orthogonal to every vector $\vec{v}\in M$, i.e. $\vec{w}\cdot\vec{v}=0$ for all $\vec{v}\in M$
a) The only vector $\vec{v}\in M$ that is orthogonal to $M$ is $\vec{0}$
b) If $\vec{w}_1$ and $\vec{w}_2$ are orthogonal to $M$, then any $c_1\vec{w}_1+c_2\vec{w}_2$ with $c_1,c_2\in\mathbb{R}$ is also orthogonal to $M$
c) If $B_M=\langle\vec{\beta}_1,...,\vec{\beta}_k\rangle$ is a basis for $M$, then $\vec{w}$ is orthogonal to $M$ iff $\vec{w}\cdot\vec{\beta}_i=0$ for all $i=1,...,k$

Proofs

a) We must have $\vec{v}$ orthogonal to itself, so
$$\vec{v}\cdot\vec{v}=|\vec{v}|^2=0\implies \vec{v}=\vec{0}$$
b) We have $\vec{w}_1\cdot\vec{v}=0$ and $\vec{w}_2\cdot\vec{v}=0$ for all $\vec{v}\in M$, so
$$(c_1\vec{w}_1+c_2\vec{w}_2)\cdot\vec{v}=c_1\vec{w}_1\cdot\vec{v}+c_2\vec{w}_2\cdot\vec{v}=0$$
c) If $\vec{w}\in\mathbb{R}^n$ is orthogonal to $M$, then it is orthogonal to every $\vec{\beta}_i\in M$. Conversely, assume $\vec{w}\in\mathbb{R}^n$ is such that $\vec{w}\cdot\vec{\beta}_i=0$ for all $i=1,...,k$.
Any vector $\vec{v}\in M$ can be represented as $\vec{v}=c_1\vec{\beta}_1+\cdots+c_k\vec{\beta}_k$, so
$$\vec{w}\cdot\vec{v}=\vec{w}\cdot(c_1\vec{\beta}_1+\cdots+c_k\vec{\beta}_k)=c_1\vec{w}\cdot\vec{\beta}_1+\cdots+c_k\vec{w}\cdot\vec{\beta}_k=0$$


Onto a Subspace

This is a generalization of the projection onto a line.

Let $M$ be a subspace of $\mathbb{R}^n$. Then for every vector $\vec{w}\in\mathbb{R}^n$, there exists a unique vector $\vec{v}\in M$ such that $\vec{w}-\vec{v}$ is orthogonal to $M$.
We denote $\vec{v}=\text{proj}_M(\vec{w})$ and call it the orthogonal projection of $\vec{w}$ on $M$.
If $B_M=\langle\vec{b}_1,...,\vec{b}_k\rangle$ is an orthonormal basis for $M$, then
$$\text{proj}_M(\vec{w})=(\vec{w}\cdot\vec{b}_1)\vec{b}_1+\cdots+(\vec{w}\cdot\vec{b}_k)\vec{b}_k$$

Proof

We claim the vector $\vec{v}=(\vec{w}\cdot\vec{b}_1)\vec{b}_1+\cdots+(\vec{w}\cdot\vec{b}_k)\vec{b}_k$ is such that $\vec{w}-\vec{v}$ is orthogonal to $M$. Since $\vec{v}\in M$ and $B_M$ is an orthonormal basis, $\vec{v}=(\vec{v}\cdot\vec{b}_1)\vec{b}_1+\cdots+(\vec{v}\cdot\vec{b}_k)\vec{b}_k$.
Therefore,
$$\vec{v}\cdot\vec{b}_1=\vec{w}\cdot\vec{b}_1,\ ...,\ \vec{v}\cdot\vec{b}_k=\vec{w}\cdot\vec{b}_k$$
This implies $(\vec{w}-\vec{v})\cdot\vec{b}_i=0$ for all $i=1,...,k$, so by c) from before $\vec{w}-\vec{v}$ is orthogonal to $M$.
Now suppose $\vec{v}_1,\vec{v}_2\in M$ are such that $\vec{w}-\vec{v}_1$ and $\vec{w}-\vec{v}_2$ are both orthogonal to $M$. By b) from before, $(\vec{w}-\vec{v}_1)-(\vec{w}-\vec{v}_2)=\vec{v}_2-\vec{v}_1$ is orthogonal to $M$, but $\vec{v}_2-\vec{v}_1\in M$, so by a) $\vec{v}_2-\vec{v}_1=\vec{0}\implies\vec{v}_2=\vec{v}_1$,
proving uniqueness
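With an orthonormal basis in hand, the projection is a direct sum of dot products. A sketch assuming NumPy (the plane $x+z=0$ and its basis are chosen for illustration):

```python
import numpy as np

# orthonormal basis for M = {(x, y, z) | x + z = 0}
b1 = np.array([0.0, 1.0, 0.0])
b2 = np.array([1.0, 0.0, -1.0]) / np.sqrt(2)

def proj_M(w):
    """proj_M(w) = (w.b1) b1 + (w.b2) b2 for the orthonormal basis above."""
    return (w @ b1) * b1 + (w @ b2) * b2

w = np.array([1.0, -1.0, 1.0])
p = proj_M(w)  # gives (0, -1, 0)

# w - p is orthogonal to both basis vectors, hence to all of M
assert abs((w - p) @ b1) < 1e-12 and abs((w - p) @ b2) < 1e-12
```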

Let $M$ be a subspace of $\mathbb{R}^n$. The map $\text{proj}_M:\mathbb{R}^n\to M,\ \vec{w}\mapsto\text{proj}_M(\vec{w})$ is a linear map.

Proof

We must show that for $\vec{w}_1,\vec{w}_2\in\mathbb{R}^n$ and $c_1,c_2\in\mathbb{R}$,
$$\text{proj}_M(c_1\vec{w}_1+c_2\vec{w}_2)=c_1\text{proj}_M(\vec{w}_1)+c_2\text{proj}_M(\vec{w}_2)$$
Both $\vec{w}_1-\text{proj}_M(\vec{w}_1)$ and $\vec{w}_2-\text{proj}_M(\vec{w}_2)$ are orthogonal to $M$. Therefore, the linear combination
$$c_1(\vec{w}_1-\text{proj}_M(\vec{w}_1))+c_2(\vec{w}_2-\text{proj}_M(\vec{w}_2))=(c_1\vec{w}_1+c_2\vec{w}_2)-(c_1\text{proj}_M(\vec{w}_1)+c_2\text{proj}_M(\vec{w}_2))$$
is also orthogonal to $M$.
Since $c_1\text{proj}_M(\vec{w}_1)+c_2\text{proj}_M(\vec{w}_2)\in M$, by uniqueness of the projection we must have
$$c_1\text{proj}_M(\vec{w}_1)+c_2\text{proj}_M(\vec{w}_2)=\text{proj}_M(c_1\vec{w}_1+c_2\vec{w}_2)$$


The orthogonal complement of a subspace $M$ of $\mathbb{R}^n$ is
$$M^{\perp}=\{\vec{w}\in\mathbb{R}^n\ |\ \vec{w}\text{ is orthogonal to }M\}$$
(read "$M$ perp")

Example 12.4

Find the orthogonal complement of the plane in $\mathbb{R}^3$
$$P=\{\begin{pmatrix}x\\y\\z\end{pmatrix}\ |\ 3x+2y-z=0\}$$


First, find a basis for $P$:
$$B=\langle\begin{pmatrix}1\\0\\3\end{pmatrix},\begin{pmatrix}0\\1\\2\end{pmatrix}\rangle$$

Steps We have $z=3x+2y$, so $P=\left\{\begin{pmatrix}1\\0\\3\end{pmatrix}x+\begin{pmatrix}0\\1\\2\end{pmatrix}y\ |\ x,y\in\mathbb{R}\right\}$

A $\vec{v}$ that is orthogonal to every vector in $B$ is orthogonal to every vector in $\text{span}(B)=P$
So this gives two conditions:
$$\begin{array}{cc}\begin{pmatrix}1\\0\\3\end{pmatrix}\cdot\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}=0&\begin{pmatrix}0\\1\\2\end{pmatrix}\cdot\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}=0\end{array}$$
This gives a linear system
$$P^{\perp}=\{\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}\ |\ \begin{pmatrix}1&0&3\\0&1&2\end{pmatrix}\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}\}$$
We therefore must find the nullspace of the matrix; with $v_3$ free, $v_1=-3v_3$ and $v_2=-2v_3$, so
$$P^\perp=\{\begin{pmatrix}-3\\-2\\1\end{pmatrix}t\ |\ t\in\mathbb{R}\}$$
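Numerically, $P^\perp$ is the null space of that coefficient matrix; SciPy's `null_space` computes it directly (a sketch assuming SciPy is available):

```python
import numpy as np
from scipy.linalg import null_space

# rows are the basis vectors of P, so the null space is P-perp
A = np.array([[1.0, 0.0, 3.0],
              [0.0, 1.0, 2.0]])
N = null_space(A)        # one orthonormal column spanning P-perp
d = N[:, 0] / N[2, 0]    # rescale so the last entry is 1
# d gives (-3, -2, 1), matching the result above
```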

For a subspace $M$ and its orthogonal complement $M^\perp$,

  1. $M^\perp$ is itself a subspace
  2. $M\cap M^\perp=\{\vec{0}\}$
  3. For every $\vec{w}\in\mathbb{R}^n$, $\vec{w}-\text{proj}_M(\vec{w})\in M^\perp$
  4. The span of $M^\perp\cup M$ is all of $\mathbb{R}^n$
  5. If $\text{dimension}(M)=k$, then $\text{dimension}(M^\perp)=n-k$
Proofs
  1. $\vec{0}\in M$ and $\vec{0}\cdot\vec{v}=0$ for all $\vec{v}\in M$, so $\vec{0}\in M^\perp$ as well.
    From b) from before, $M^\perp$ is closed under vector addition and scalar multiplication. Thus, $M^\perp$ is a subspace of $\mathbb{R}^n$

  2. From a) from before, the only vector in $M$ that is orthogonal to $M$ is $\vec{0}$, so $M\cap M^\perp=\{\vec{0}\}$

  3. By definition of $\text{proj}_M(\vec{w})$, $\vec{w}-\text{proj}_M(\vec{w})$ is orthogonal to $M$, so $\vec{w}-\text{proj}_M(\vec{w})\in M^\perp$

  4. For any $\vec{w}\in\mathbb{R}^n$, we have $\vec{w}=(\vec{w}-\text{proj}_M(\vec{w}))+\text{proj}_M(\vec{w})$,
    where $\vec{w}-\text{proj}_M(\vec{w})\in M^\perp$ and $\text{proj}_M(\vec{w})\in M$

  5. First, suppose $\text{dimension}(M^\perp)=l$.
    Then choose orthonormal bases $B_M=\langle\vec{b}_1,...,\vec{b}_k\rangle$ of $M$ and $B_{M^\perp}=\langle\vec{b}_{k+1},...,\vec{b}_{k+l}\rangle$ of $M^\perp$.
    $B_M$ spans $M$ and $B_{M^\perp}$ spans $M^\perp$, so $\langle\vec{b}_1,...,\vec{b}_k,\vec{b}_{k+1},...,\vec{b}_{k+l}\rangle$ spans $\mathbb{R}^n$ by (4).
    We consider $\vec{b}_i\cdot\vec{b}_j$ for $i<j$:
    If $j\le k$, since $\langle\vec{b}_1,...,\vec{b}_k\rangle$ is orthonormal, $\vec{b}_i\cdot\vec{b}_j=0$.
    If $k+1\le i$, since $\langle\vec{b}_{k+1},...,\vec{b}_{k+l}\rangle$ is orthonormal, $\vec{b}_i\cdot\vec{b}_j=0$.
    If $i\le k$ and $k+1\le j$, then $\vec{b}_i\in M$ and $\vec{b}_j\in M^\perp$, so they are perpendicular, thus $\vec{b}_i\cdot\vec{b}_j=0$.
    So the family $\{\vec{b}_1,...,\vec{b}_k,\vec{b}_{k+1},...,\vec{b}_{k+l}\}$ is mutually orthogonal and nonzero, hence linearly independent, and it spans $\mathbb{R}^n$. Therefore $k+l=n\implies l=n-k$, finishing the proof.

If $M$ is a subspace of $\mathbb{R}^n$, then $M$ is the orthogonal complement of $M^\perp$, i.e. $(M^\perp)^\perp=M$
For every $\vec{w}\in\mathbb{R}^n$,
$$\vec{w}=\text{proj}_M(\vec{w})+\text{proj}_{M^\perp}(\vec{w})$$

Proof

From the definition of $M^\perp$, if $\vec{v}\in M$ then $\vec{v}$ is orthogonal to every vector in $M^\perp$, so $\vec{v}\in(M^\perp)^\perp$, and $M\subseteq(M^\perp)^\perp$.
Furthermore, we know that $\text{dimension}(M)+\text{dimension}(M^\perp)=n$ and $\text{dimension}(M^\perp)+\text{dimension}((M^\perp)^\perp)=n$, so $\text{dimension}(M)=\text{dimension}((M^\perp)^\perp)$.
With those two facts, we can conclude that $M=(M^\perp)^\perp$.
For the second part, define $\vec{w}^\perp=\vec{w}-\text{proj}_M(\vec{w})$, which lies in $M^\perp$ by (3). Since $\vec{w}-\vec{w}^\perp=\text{proj}_M(\vec{w})\in M=(M^\perp)^\perp$, the difference $\vec{w}-\vec{w}^\perp$ is orthogonal to $M^\perp$; together with $\vec{w}^\perp\in M^\perp$, the uniqueness of the projection gives $\vec{w}^\perp=\text{proj}_{M^\perp}(\vec{w})$.
Finally, $\vec{w}=\vec{w}^\perp+\text{proj}_M(\vec{w})=\text{proj}_{M^\perp}(\vec{w})+\text{proj}_M(\vec{w})$

Given a subspace $M\subseteq\mathbb{R}^n$, how can we compute $\text{proj}_M(\vec{w})$ of a vector $\vec{w}\in\mathbb{R}^n$?
We will suppose the basis for $M$ is $B=\langle\vec{b}_1,...,\vec{b}_k\rangle$.
If $B$ is an orthonormal basis, then we know
$$\text{proj}_M(\vec{w})=(\vec{w}\cdot\vec{b}_1)\vec{b}_1+\cdots+(\vec{w}\cdot\vec{b}_k)\vec{b}_k=UU^T\vec{w}$$
where $U$ is the $n\times k$ matrix whose columns are $\vec{b}_1,...,\vec{b}_k$, or equivalently
$$\text{Rep}_{B}(\text{proj}_M(\vec{w}))=\begin{pmatrix}\vec{w}\cdot\vec{b}_1\\\vdots\\\vec{w}\cdot\vec{b}_k\end{pmatrix}$$

If $B$ is merely an orthogonal basis, then
$$\langle\frac{\vec{b}_1}{|\vec{b}_1|},...,\frac{\vec{b}_k}{|\vec{b}_k|}\rangle$$
is orthonormal.

If $B$ isn't orthogonal, you could use Gram-Schmidt, but we use a more convenient formula:

Let $M\subseteq\mathbb{R}^n$ be a subspace with basis $\langle\vec{b}_1,...,\vec{b}_k\rangle$ and let $A$ be the matrix whose columns are the $\vec{b}_i$'s. Then
$$\text{proj}_M(\vec{v})=c_1\vec{b}_1+\cdots+c_k\vec{b}_k$$
where the $c_i$'s are the entries of the vector
$$(A^TA)^{-1}A^T\cdot\vec{v}$$
or equivalently,
$$\text{proj}_M(\vec{v})=A(A^TA)^{-1}A^T\cdot\vec{v}$$

Proof

Given: $\langle\vec{b}_1,...,\vec{b}_k\rangle$ is the basis of $M\subseteq\mathbb{R}^n$ and $A$ is an $n\times k$ matrix with column $i$ being $\vec{b}_i$.
$\text{proj}_M(\vec{v})\in M\implies\text{proj}_M(\vec{v})=c_1\vec{b}_1+\cdots+c_k\vec{b}_k=A\vec{c}$ where $\vec{c}=\begin{pmatrix}c_1\\\vdots\\c_k\end{pmatrix}$
$\vec{v}-\text{proj}_M(\vec{v})$ is orthogonal to every $\vec{b}_i$, i.e. to every row of $A^T$, so $A^T(\vec{v}-\text{proj}_M(\vec{v}))=\vec{0}$
$\implies A^T(\vec{v}-A\vec{c})=A^T\vec{v}-A^TA\vec{c}=\vec{0}\implies\vec{c}=(A^TA)^{-1}A^T\vec{v}$ ($A^TA$ is invertible because the columns of $A$ are linearly independent)
Thus, $\text{proj}_M(\vec{v})=A\vec{c}=A(A^TA)^{-1}A^T\vec{v}$

Note that $(A^TA)^{-1}\ne A^{-1}(A^T)^{-1}$ because $A$ is not square

Example 12.5

Project $\vec{v}=\begin{pmatrix}1\\-1\\1\end{pmatrix}$ onto the plane $P=\{\begin{pmatrix}x\\y\\z\end{pmatrix}\ |\ x+z=0\}$


A basis for $P$ is $\langle\begin{pmatrix}0\\1\\0\end{pmatrix},\begin{pmatrix}1\\0\\-1\end{pmatrix}\rangle$ so
$$A=\begin{pmatrix}0&1\\1&0\\0&-1\end{pmatrix}\quad A^T=\begin{pmatrix}0&1&0\\1&0&-1\end{pmatrix}$$
Now, we simply compute $A(A^TA)^{-1}A^T\vec{v}$:
$$A^TA=\begin{pmatrix}1&0\\0&2\end{pmatrix}$$
$$(A^TA)^{-1}=\begin{pmatrix}1&0\\0&1/2\end{pmatrix}$$
$$(A^TA)^{-1}A^T=\begin{pmatrix}0&1&0\\1/2&0&-1/2\end{pmatrix}$$
$$A(A^TA)^{-1}A^T=\begin{pmatrix}1/2&0&-1/2\\0&1&0\\-1/2&0&1/2\end{pmatrix}$$
Finally,
$$\text{proj}_P(\vec{v})=\begin{pmatrix}1/2&0&-1/2\\0&1&0\\-1/2&0&1/2\end{pmatrix}\begin{pmatrix}1\\-1\\1\end{pmatrix}=\begin{pmatrix}0\\-1\\0\end{pmatrix}$$
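The same computation in NumPy (a sketch; `proj_matrix` is a hypothetical helper name, not from the text):

```python
import numpy as np

def proj_matrix(A):
    """Projection matrix A (A^T A)^{-1} A^T onto the column space of A."""
    return A @ np.linalg.inv(A.T @ A) @ A.T

# Example 12.5: columns of A are a basis for the plane x + z = 0
A = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.0, -1.0]])
P = proj_matrix(A)
v = np.array([1.0, -1.0, 1.0])
p = P @ v  # gives (0, -1, 0)

# any orthogonal projection matrix is idempotent and symmetric
assert np.allclose(P @ P, P) and np.allclose(P, P.T)
```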

Given a subspace $M\subseteq\mathbb{R}^n$, the distance from $\vec{w}\in\mathbb{R}^n$ to $M$ is the smallest possible distance from $\vec{w}$ to a point of $M$.
The distance from $\vec{w}$ to $M$ is $|\vec{w}-\text{proj}_M(\vec{w})|$; equivalently, $|\vec{w}-\vec{v}|\ge|\vec{w}-\text{proj}_M(\vec{w})|$ for all $\vec{v}\in M$

Proof

We know $\vec{w}=\text{proj}_M(\vec{w})+\text{proj}_{M^\perp}(\vec{w})$
$\implies\vec{w}-\vec{v}=(\text{proj}_M(\vec{w})-\vec{v})+\text{proj}_{M^\perp}(\vec{w})$
Since $\vec{v}\in M$, $\text{proj}_M(\vec{w})-\vec{v}\in M$, so it is orthogonal to $\vec{w}-\text{proj}_M(\vec{w})=\text{proj}_{M^\perp}(\vec{w})\in M^\perp$. Therefore, by the Pythagorean Theorem,
$$|\vec{w}-\vec{v}|^2=|\text{proj}_M(\vec{w})-\vec{v}|^2+|\vec{w}-\text{proj}_M(\vec{w})|^2$$
So $|\vec{w}-\vec{v}|\ge|\vec{w}-\text{proj}_M(\vec{w})|$, with equality exactly when $\vec{v}=\text{proj}_M(\vec{w})$
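Putting the distance formula to work numerically (a sketch reusing the plane from Example 12.5; `dist_to_subspace` is a hypothetical name):

```python
import numpy as np

def dist_to_subspace(w, A):
    """Distance |w - proj_M(w)| where M is the column space of A."""
    P = A @ np.linalg.inv(A.T @ A) @ A.T  # projection matrix onto M
    return np.linalg.norm(w - P @ w)

# plane x + z = 0, basis vectors as columns (as in Example 12.5)
A = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.0, -1.0]])
w = np.array([1.0, -1.0, 1.0])
d = dist_to_subspace(w, A)  # proj is (0, -1, 0), residual (1, 0, 1), so d = sqrt(2)
```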